
87 research outputs found

ISR-WN: Integration of semantic resources based on WordNet

Author: Gutiérrez Yoan
Montoyo Andres
Vázquez Sonia
Publication date: 09/07/2015

This software tool builds a semantic network from the following resources: WordNet versions 1.6 and 2.0, WordNet Affect versions 1.0 and 1.1, WordNet Domains version 2.0, SUMO, Semantic Classes, and SentiWordNet version 3.0, all integrated and interrelated in a single knowledge base. On top of these resources, ISR-WN adds functionality for exploring the network in a simple way, through both traversal functions and textual search. Querying the semantic network yields information for enriching texts, such as the definitions of words commonly used in particular domains, emotional domains, and other conceptualizations, as well as the positivity, negativity, and objectivity scores that SentiWordNet assigns to a given word sense. All this information can be used in natural language processing tasks such as word sense disambiguation, sentiment polarity detection, and semantic and lexical analysis to obtain the relevant concepts in a sentence according to the type of resource involved. The tool is based on the English language and is available as a Windows application with an installer that deploys the libraries required for its correct use on the host computer. In addition to the user interface provided, the tool can be used as an API (Application Programming Interface) by other applications.
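The idea of one knowledge base in which each word sense carries annotations from several resources can be illustrated with a small sketch. This is not the ISR-WN API: the class names, sense identifiers, glosses, labels, and scores below are all hypothetical and exist only to show the shape of the data. The one grounded detail is that SentiWordNet derives objectivity as 1 minus the sum of the positive and negative scores.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """A word sense annotated by several resource layers (all values hypothetical)."""
    sense_id: str
    lemma: str
    gloss: str
    domains: set = field(default_factory=set)   # WordNet Domains style labels
    affects: set = field(default_factory=set)   # WordNet Affect style labels
    positive: float = 0.0                       # SentiWordNet style scores
    negative: float = 0.0

    @property
    def objective(self) -> float:
        # SentiWordNet derives objectivity as 1 - (positive + negative).
        return 1.0 - (self.positive + self.negative)


class SemanticNetwork:
    """Toy single knowledge base keyed by sense identifier."""

    def __init__(self):
        self._senses = {}

    def add(self, sense: Sense) -> None:
        self._senses[sense.sense_id] = sense

    def senses_of(self, lemma: str) -> list:
        """Textual lookup: every sense registered for a lemma."""
        return [s for s in self._senses.values() if s.lemma == lemma]

    def in_domain(self, domain: str) -> list:
        """Lookup by layer: ids of the senses labelled with a given domain."""
        return sorted(s.sense_id for s in self._senses.values() if domain in s.domains)


# Illustrative entries only; they are not taken from the real resources.
net = SemanticNetwork()
net.add(Sense("happy#a#1", "happy", "enjoying or showing joy",
              domains={"psychology"}, affects={"joy"}, positive=0.875))
net.add(Sense("bank#n#1", "bank", "a financial institution",
              domains={"economy"}))

for sense in net.senses_of("happy"):
    print(sense.sense_id, sense.gloss, sense.positive, sense.negative, sense.objective)
```

Keeping every layer on the same sense record is one simple way to answer the kinds of queries the abstract mentions (definition, domain, affect, and sentiment of a sense) with a single lookup.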

Repositorio Institucional de la Universidad de Alicante

T2Know: An Advance Scientific-Tecnical Text Analysis Platform for Trend and Knowledge Extraction Using NLP Techniques

Author: Gutiérrez Yoan
Montoyo Andres
Muñoz Rafael
Publication venue: CEUR
Publication date: 23/10/2023

The T2Know project applies natural language processing technologies to create a semantic platform for scientific documents based on knowledge graphs. The knowledge graph will link the relevant parts of each document with those of other documents so that trend analysis and recommendations become possible. The goals addressed within the scope of the project include the development of entity recognizers, profile definition, and document linkage using transformer technologies. As a result, the relevant parts to be extracted from each document cover not only the title and the authors' affiliations but also other elements of the article, such as its references, which are likewise considered relevant parts of a scientific article. This project is funded by the Valencian Agency for Innovation through the project INNEST/2022/24 and partially funded by the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport) through the following projects: NL4DISMIS: TLHs for an Equal and Accessible Inclusive Society (CIPROM/2021/021) and T2Know: Platform for advanced analysis of scientific-technical texts to extract trends and knowledge through NLP techniques (Innest/2022/24). Moreover, it was backed by the work of two COST Actions: CA19134 - “Distributed Knowledge Graphs” and CA19142 - “Leading Platform for European Citizens, Industries, Academia, and Policymakers in Media Accessibility”.

Repositorio Institucional de la Universidad de Alicante

Analysing the Twitter accounts of licensed Sports gambling operators in Spain: a space for responsible gambling?

Author: Gutiérrez Y. (Yoan)
Hernández-Ruiz A. (Alejandra)
Publication venue: Servicio de Publicaciones de la Universidad de Navarra
Publication date: 01/01/2021

Apart from the economic impact of the online gambling industry, the social, public order and health-related consequences of the industry merit analysis to inform appropriate action, regulatory or otherwise. The omnipresence of ICTs, the inability to use technologies properly, along with the growth of online gambling channels, have acted simultaneously as a catalyst for the spread of pathological and problematic gambling. In this context, social networks have become a highly effective platform to instil positive attitudes towards the products of gambling operators. This work uses the Natural Language Processing based web application “GPLSI Social Analytics” to track, in real time, the conversations generated on Twitter about the Spanish domain accounts of the main online sports gambling operators. The findings indicate that most of the messages about these operators are positive and surprise is the predominant emotion associated with them. The notion of responsible online gambling barely receives a mention in the conversations analysed. Given the role of new technologies as access facilitators and potential enhancers of addictive behaviours, it is necessary to adopt measures directed at social networks that guarantee the coexistence of the right to freedom of expression with the protection of the most vulnerable populations

Repositorio Institucional de la Universidad de Alicante

Directory of Open Access Journals

Universidad de Navarra

Dadun, University of Navarra

A computational ecosystem to support eHealth Knowledge Discovery technologies in Spanish

Author: Almeida-Cruz Yudivian
Gutiérrez Yoan
Muñoz Rafael
Piad-Morffis Alejandro
Publication venue: 'Elsevier BV'
Publication date: 01/09/2020

The massive amount of biomedical information published online requires the development of automatic knowledge discovery technologies to make effective use of this content. To foster and support this, the research community creates linguistic resources, such as annotated corpora, and designs shared evaluation campaigns and competitive academic challenges. This work describes an ecosystem that facilitates research and development in knowledge discovery in the biomedical domain, specifically in the Spanish language. To this end, several resources are developed and shared with the research community, including a novel semantic annotation model, an annotated corpus of 1045 sentences, and computational resources to build and evaluate automatic knowledge discovery techniques. Furthermore, a research task is defined with objective evaluation criteria, and an online evaluation environment is set up and maintained, enabling researchers interested in this task to obtain immediate feedback and compare their results with the state of the art. As a case study, we analyze the results of a competitive challenge based on these resources and provide guidelines for future research. The constructed ecosystem provides an effective learning and evaluation environment to encourage research in knowledge discovery in Spanish biomedical documents. This research has been partially supported by the University of Alicante and the University of Havana, the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport), and the Spanish Government through the projects SIIA (PROMETEO/2018/089, PROMETEU/2018/089) and LIVING-LANG (RTI2018-094653-B-C22).

Repositorio Institucional de la Universidad de Alicante

Automatic Discovery of Heterogeneous Machine Learning Pipelines: An Application to Natural Language Processing

Author: Almeida-Cruz Yudivian
Estévez-Velarde Suilan
Gutiérrez Yoan
Montoyo Andres
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

This paper presents AutoGOAL, a system for automatic machine learning (AutoML) that uses heterogeneous techniques. In contrast with existing AutoML approaches, our contribution can automatically build machine learning pipelines that combine techniques and algorithms from different frameworks, including shallow classifiers, natural language processing tools, and neural networks. We define the heterogeneous AutoML optimization problem as the search for the best sequence of algorithms that transforms specific input data into the desired output. This provides a novel theoretical and practical approach to AutoML. Our proposal is experimentally evaluated on diverse machine learning problems and compared with alternative approaches, showing that it is competitive with other AutoML alternatives on standard benchmarks. Furthermore, it can be applied to novel scenarios, such as several NLP tasks, where existing alternatives cannot be directly deployed. The system is freely available and includes built-in compatibility with a large number of popular machine learning frameworks, which makes our approach useful for solving practical problems with relative ease. This research has been supported by a Carolina Foundation grant in agreement with the University of Alicante and the University of Havana. Moreover, it has also been partially funded by both aforementioned universities, the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport), and the Spanish Government through the projects LIVING-LANG (RTI2018-094653-B-C22) and SIIA (PROMETEO/2018/089, PROMETEU/2018/089).
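The framing of AutoML as a search over sequences of algorithms that transform an input into the desired output can be illustrated with a toy enumeration. The sketch below is not AutoGOAL's API or its search strategy: the step names, the type labels, and the exhaustive depth-first enumeration are assumptions chosen only to show how type annotations restrict which algorithms may follow one another.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """A hypothetical algorithm described only by the types it consumes and produces."""
    name: str
    input_type: str
    output_type: str


# Illustrative catalogue mixing NLP tools, shallow classifiers and neural models.
STEPS = [
    Step("tokenizer", "text", "tokens"),
    Step("tfidf", "tokens", "vector"),
    Step("embedding", "tokens", "vector"),
    Step("logistic_regression", "vector", "label"),
    Step("neural_net", "vector", "label"),
]


def enumerate_pipelines(source, target, steps, max_len=4):
    """Yield every chain of step names whose types connect source to target."""

    def extend(current_type, chain):
        if current_type == target and chain:
            yield tuple(step.name for step in chain)
            return
        if len(chain) == max_len:
            return  # depth limit guards against unbounded chains
        for step in steps:
            if step.input_type == current_type:
                yield from extend(step.output_type, chain + [step])

    yield from extend(source, [])


pipelines = list(enumerate_pipelines("text", "label", STEPS))
for pipeline in pipelines:
    print(" -> ".join(pipeline))
```

A real system would then score each candidate on held-out data and keep the best one. Exhaustive enumeration is only practical for a catalogue this small, so it stands in here for whatever optimization strategy is actually used.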

Repositorio Institucional de la Universidad de Alicante

Crossref

ElectionMap: a geolocalized representation of voting intentions to political parties based on twitter's user comments

Author: Agulló Antolín Francisco
Guillén Antonio
Gutiérrez Yoan
Martínez-Barco Patricio
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2015

ElectionMap is a web application that tracks comments posted on Twitter about entities that refer to political parties. Users' opinions about these entities are classified by polarity using sentiment analysis and then plotted on a geographic map, showing the social acceptance of political parties across the different regions of Spain. ElectionMap was developed by the Natural Language Processing and Information Systems Group (GPLSI) of the University of Alicante. The application has been partially funded by the Spanish Government and the European Commission through the projects ATTOS (TIN2012-38536-C03-03), LEGOLANG (TIN2012-31224), SAM (FP7-611312), and FIRST (FP7-287607), and by the University of Alicante through the emerging project “Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario” (GRE13-15).
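The step from classified opinions to a per-region view can be sketched as a simple aggregation. The abstract does not say how ElectionMap combines opinions, so the averaging below is an assumption, and the function name, the polarity encoding (-1, 0, 1), the party names, and the sample records are all hypothetical.

```python
from collections import defaultdict


def acceptance_by_region(opinions):
    """Average polarity per (region, party).

    opinions: iterable of (region, party, polarity), polarity in {-1, 0, 1}.
    Averaging is an illustrative choice, not the documented method.
    """
    totals = defaultdict(lambda: [0, 0])  # (region, party) -> [polarity sum, count]
    for region, party, polarity in opinions:
        entry = totals[(region, party)]
        entry[0] += polarity
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up classified tweets, already geolocated and labelled.
sample = [
    ("Alicante", "Party A", 1),
    ("Alicante", "Party A", 1),
    ("Alicante", "Party A", -1),
    ("Madrid", "Party A", -1),
    ("Madrid", "Party B", 0),
    ("Madrid", "Party B", 1),
]

scores = acceptance_by_region(sample)
for (region, party), score in sorted(scores.items()):
    print(f"{region:10s} {party:8s} {score:+.2f}")
```

Each resulting score could then drive the colour of a region on a map, with one layer per party.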

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Socialising around media. Improving the second screen experience through semantic analysis, context awareness and dynamic communities

Author: Aisopos Fotis
Badii Atta
Gutiérrez Yoan
Tiemann Marco
Tomás David
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

SAM is a social media platform that enhances the experience of watching video content in a conventional living room setting, with a service that lets the viewer use a second screen (such as a smart phone) to interact with content, context and communities related to the main video content. This article describes three key functionalities used in the SAM platform in order to create an advanced interactive and social second screen experience for users: semantic analysis, context awareness and dynamic communities. Both dataset-based and end user evaluations of system functionalities are reported in order to determine the effectiveness and efficiency of the components directly involved and the platform as a whole

Repositorio Institucional de la Universidad de Alicante

Central Archive at the University of Reading

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Social Rankings: Visual Sentiment Analysis in Social Networks

Author: Fernández Martínez Javier
Gutiérrez Yoan
Gómez José M.
Martínez-Barco Patricio
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2015

Social Rankings is a web application that tracks entities on social networks in real time. It detects and analyses opinions about these entities using sentiment analysis techniques to generate a visual report of their reputation and its evolution over time. Social Rankings was developed by the Natural Language Processing and Information Systems Group (GPLSI) of the University of Alicante. The application has been partially funded by the Spanish Government through the projects ATTOS (TIN2012-38536-C03-03) and LEGOLANG (TIN2012-31224), the European Commission through the project SAM (FP7-611312), the Generalitat Valenciana through the project DIIM2.0 (PROMETEOII/2014/001), and the University of Alicante through the emerging project “Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario” (GRE13-15).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Managing summaries for mobile devices

Author: Gutiérrez Yoan
Gómez José M.
Llopis Fernando
Lloret Elena
Martínez-Barco Patricio
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2016

Mobile devices have significantly changed the way users access the information available on the Internet. These devices allow instant access anytime and anywhere, but they have a number of important limitations compared with personal computers. Their limited screen space and, at times, their limited capacity to receive information, given the cost, make the selection of which information to access even more important. Automatic multi-document summarization is an alternative of interest and is the subject of this article, which presents and evaluates an automatic summarization model together with a passage-based information retrieval system. This research was funded by the projects DIIM2.0 (PROMETEOII/2014/001) of the Generalitat Valenciana; TIN2015-65100-R and DIGITY (TIN2015-65136-C2-2-R) of the Ministerio de Economía y Competitividad; and SAM (FP7-611312) of the European Union.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

GPLSI Wikipedia Characterisation V1.0: Entity Discovery and Linking to Wikipedia

Author: Gutiérrez Yoan
Moreno Isabel
Tomás David
Publication date: 01/06/2017

Application summary: GPLSI Wikipedia Characterisation (Entity Discovery and Linking to Wikipedia) is an application programming interface (API) that includes programming libraries useful for third-party systems. The API analyses textual content to discover entity mentions and link them to Wikipedia by means of DBpedia, its structured version. The result is a list of suggested DBpedia URIs (each corresponding to a Wikipedia page) for each entity, ranked by a confidence score in the interval [0,1]. This score is obtained from two key features. The first is the number of incoming links to each Wikipedia entity (more links implies more relevance). The second is the similarity between the context of the target entity (the list of words adjacent to the target word) and the description of each Wikipedia entity. For this purpose, a disambiguation algorithm based on the Lesk paradigm is used, combined with statistics on the incoming links to Wikipedia pages. The technology achieves around 70% F1. Funding: Ministerio de Educación, Cultura y Deporte; Ministerio de Economía y Competitividad (MINECO), projects TIN2015-65136-C2-2-R and TIN2015-65100-R; European Commission (SAM project, FP7-611312); Generalitat Valenciana (PROMETEOII/2014/001); Fundación BBVA grants for scientific research teams 2016 (Análisis de Sentimientos Aplicado a la Prevención del Suicidio en las Redes Sociales - ASAP); and the University of Alicante through the emerging project "GRE16-01: Plataforma inteligente para recuperación, análisis y representación de la información generada por usuarios en Internet".
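The two features described above, incoming-link counts and a Lesk-style overlap between the mention's context and each candidate's description, can be combined as in the following sketch. The abstract does not give the actual formula, so the equal-weight linear combination, the normalisation by the largest in-link count, the function names, and the sample in-link counts and descriptions are all assumptions made for illustration.

```python
def lesk_overlap(context_words, description):
    """Fraction of context words that also occur in the candidate description."""
    context = {word.lower() for word in context_words}
    if not context:
        return 0.0
    description_words = set(description.lower().split())
    return len(context & description_words) / len(context)


def rank_candidates(context_words, candidates, weight=0.5):
    """Rank candidate entities by a confidence score in [0, 1].

    candidates: list of dicts with 'uri', 'inlinks' and 'description'.
    weight: share given to in-link popularity (hypothetical parameter).
    """
    if not candidates:
        return []
    max_inlinks = max(c["inlinks"] for c in candidates) or 1
    scored = []
    for candidate in candidates:
        popularity = candidate["inlinks"] / max_inlinks
        similarity = lesk_overlap(context_words, candidate["description"])
        score = weight * popularity + (1 - weight) * similarity
        scored.append((candidate["uri"], score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# In-link counts and descriptions below are illustrative, not real statistics.
candidates = [
    {"uri": "http://dbpedia.org/resource/Java",
     "inlinks": 400,
     "description": "island of Indonesia"},
    {"uri": "http://dbpedia.org/resource/Java_(programming_language)",
     "inlinks": 200,
     "description": "object-oriented programming language"},
]

ranked = rank_candidates(["programming", "language", "code"], candidates)
for uri, score in ranked:
    print(f"{score:.3f}  {uri}")
```

In this example the context overlap outweighs the higher in-link count of the other candidate, which is the behaviour that combining the two signals is meant to allow.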

Repositorio Institucional de la Universidad de Alicante